Abstract. Let $\mathcal{H}$ be a given (possibly empty) family of connected graphs, each containing a cycle,
and let $G$ be an arbitrary finite $\mathcal{H}$-free graph with minimum degree at least $k$. For $p \in [0, 1]$, we
form the $p$-random subgraph $G_p$ of $G$ by independently keeping each edge of $G$ with probability
$p$. Extending a classical result of Ajtai, Komlós, and Szemerédi, we prove that for every positive
$\varepsilon$, there exists a positive $\delta$ (depending only on $\varepsilon$) such that the following holds: If $p \ge \frac{1+\varepsilon}{k}$, then
with probability tending to 1 as $k \to \infty$, the random graph $G_p$ contains a cycle of length at least
$\delta \cdot n_{\mathcal{H}}(k)$, where $n_{\mathcal{H}}(k) > k$ is the minimum number of vertices in an $\mathcal{H}$-free graph of average degree
at least $k$. Thus in particular $G_p$ as above typically contains a cycle of length at least linear in $k$.

1. Introduction 

Given a graph $G$ and a real number $p \in [0, 1]$, we define the $p$-random subgraph of $G$, denoted
$G_p$, to be the random subgraph of $G$ such that each edge of $G$ belongs to $G_p$ with probability $p$,
independently of all other edges. The most studied case of the above model is when $G$ is a complete
graph. This particular model, usually denoted $G(n,p)$, where $n$ is the number of vertices in the
base (complete) graph, was first introduced in [8] and has since become one of the most popular
objects of study in combinatorics.
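
The definition above translates directly into a sampling procedure. The following is a minimal Python sketch, not part of the original text; the helper names `complete_graph` and `random_subgraph` and the edge-list representation are our own illustrative assumptions.

```python
import random
from itertools import combinations


def complete_graph(n):
    """Edge list of the complete graph on vertices 0, ..., n-1 (hypothetical helper)."""
    return list(combinations(range(n), 2))


def random_subgraph(edges, p, rng=None):
    """Sample the p-random subgraph: keep each edge independently with probability p."""
    rng = rng or random.Random()
    # rng.random() lies in [0, 1), so p = 0 keeps nothing and p = 1 keeps everything.
    return [e for e in edges if rng.random() < p]


if __name__ == "__main__":
    base = complete_graph(50)                      # base graph K_50
    sample = random_subgraph(base, 1.5 / 49, random.Random(0))
    print(len(base), len(sample))
```

Taking the base graph to be complete, as in the usage lines, gives a sample of $G(n,p)$.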

In the groundbreaking paper of Erdős and Rényi [5], the following fundamental discovery was
made: if we let $p(n)$ gradually increase from 0 to 1, then the connectivity structure of the random
graph $G(n,p)$ undergoes a dramatic phase transition around $p(n) = \frac{1}{n}$. For any positive constant
$\varepsilon$, if $p(n) \le \frac{1-\varepsilon}{n}$, then asymptotically almost surely (a.a.s.), the size of each connected component
of $G(n,p)$ is at most logarithmic in $n$, whereas if $p(n) \ge \frac{1+\varepsilon}{n}$, then a.a.s. $G(n,p)$ has a unique
component of linear size, traditionally called the giant component. The paper of Erdős and Rényi
has had an enormous influence on the development of the theory of random graphs. Its main
results have been given several different proofs and extended or improved in many different ways.
For a detailed account of the theory of random graphs, we refer the reader to the two standard
monographs [3, 9].

One of the better known extensions of the main result of [5] is due to Ajtai, Komlós, and
Szemerédi [1], who proved that if $p(n) \ge \frac{1+\varepsilon}{n}$, then not only does the random graph $G(n,p)$ a.a.s.
contain a giant component occupying a positive proportion of all the vertices, but it also typically
has a path of length linear in $n$. An easy corollary of this fact is that if $p(n) \ge \frac{1+\varepsilon}{n}$, then a.a.s.
$G(n,p)$ contains a cycle of length linear in $n$.

The following natural generalizations of the classical results about the evolution of the random
graph mentioned above were recently considered in [6, 10, 11]. Suppose that $p_0 \colon \mathbb{N} \to [0,1]$ is a
threshold function for some (monotone) graph property $\mathcal{P}$ in $G(n,p)$. For example, one might let
$p_0(n) = \frac{1}{n}$ and $\mathcal{P}$ be the property of containing a connected component (or a path / cycle) of
size (length) linear in $n$. In particular, suppose that if $p \ge (1+\varepsilon)p_0$ for some positive constant
$\varepsilon$, then a.a.s. $G(n,p)$ possesses $\mathcal{P}$. Does this statement remain true if one replaces $G(n,p)$ with
the $p$-random subgraph $G_p$ of an arbitrary graph $G$ with minimum degree $n-1$? This has been
answered in the affirmative in several cases, e.g., when $\mathcal{P}$ is the property of being non-planar [6],
containing a path of length linear in $n$ or of length at least $n-1$ [10], and having a cycle of
length $n - o(n)$ [10]. As a by-product, more robust proofs of the corresponding statements in the
case $G = K_n$ were obtained.

In this paper, we continue this study and introduce one additional twist. Namely, we fix an
arbitrary family $\mathcal{H}$ of graphs and further assume that the base graph $G$ is $\mathcal{H}$-free, i.e., that $G$ does
not contain a copy of any $H \in \mathcal{H}$ as a subgraph. Since we allow the family $\mathcal{H}$ to be empty, which
imposes no additional restrictions on $G$, our results will generalize some previous works. Our aim
is to prove that if $G$ is an $\mathcal{H}$-free graph with minimum degree at least $k$ (from now on, we will use
$k$ to denote the lower bound on the minimum degree and $n$ to denote the number of vertices of
the graph $G$) and $p \ge \frac{1+\varepsilon}{k}$, then with probability approaching 1 as $k \to \infty$, the random graph $G_p$
contains a long path and a long cycle. Since we are only interested in the asymptotic behavior of
these probabilities, we will assume there exist $\mathcal{H}$-free graphs with arbitrarily large minimum degree.
This implies, in particular, that the family $\mathcal{H}$ cannot contain any acyclic graphs. Moreover, we will
assume for convenience (see Lemma 2.3) that every graph in the family $\mathcal{H}$ is connected. We will
term such families $\mathcal{H}$ good. Finally, we will assume throughout the paper that the base graph $G$ is
finite.

Recall that the Turán number for $\mathcal{H}$, denoted $\mathrm{ex}(n,\mathcal{H})$, is the maximum number of edges in an
$\mathcal{H}$-free graph on $n$ vertices and observe that if $G$ is an $\mathcal{H}$-free graph with minimum degree at least
$k$, then the number $n$ of vertices of $G$ satisfies

$$nk \le 2 \cdot \mathrm{ex}(n, \mathcal{H}). \tag{1}$$

It is immediate that (1) imposes a lower bound on $n$. Namely, it implies that $n \ge n_{\mathcal{H}}(k)$, where
$n_{\mathcal{H}}(k)$ is the smallest number of vertices in an $\mathcal{H}$-free graph of average degree $k$. Our main result,
Theorem 1.1 below, is that, under our assumptions on $G$ and $p$, with probability very close to 1,
the random graph $G_p$ contains a cycle of length at least $\delta \cdot n_{\mathcal{H}}(k)$ for some positive constant $\delta$ that
depends only on $\varepsilon$, which is clearly optimal up to the value of $\delta$. Note that if the family $\mathcal{H}$ does not
contain any bipartite graph, then $k + 1 \le n_{\mathcal{H}}(k) \le 2k$ and consequently, the length of the longest
cycle we can hope to find in $G_p$ is only linear in $k$. Therefore, the interesting cases will be either
when $\mathcal{H}$ is empty or when it contains at least one bipartite graph.

Theorem 1.1. For every positive $\varepsilon$, there exists a positive constant $\delta$ such that the following is
true. Let $k$ be a sufficiently large integer, let $\mathcal{H}$ be a good family of graphs, and let $\ell$ be an integer
satisfying

$$\frac{\mathrm{ex}(\ell, \mathcal{H})}{\ell} \le \delta k. \tag{2}$$

If $G$ is an $\mathcal{H}$-free graph with minimum degree at least $k$ and $p \ge \frac{1+\varepsilon}{k}$, then

$$\Pr(G_p \text{ contains a cycle of length at least } \ell) \ge 1 - \exp(-\delta k).$$

Remark 1.2. Since $\mathcal{H}$ is a good family, by well-known results from extremal graph theory $\mathrm{ex}(\ell, \mathcal{H}) \ge
\ell^{1+\eta}$ for some $\eta = \eta(\mathcal{H}) > 0$, and thus condition (2) places an upper bound on $\ell$, for a given $k$.

When we let $\mathcal{H}$ be the empty family, we obtain the following corollary.

Corollary 1.3. For every positive $\varepsilon$, there exists a positive constant $\delta$ such that the following is
true. Let $k$ be a sufficiently large integer and let $G$ be an arbitrary graph with minimum degree at
least $k$. If $p \ge \frac{1+\varepsilon}{k}$, then

$$\Pr(G_p \text{ contains a cycle of length at least } \delta k) \ge 1 - \exp(-\delta k).$$

We remark that the statement obtained from Corollary 1.3 by replacing 'a cycle of length $\delta k$'
with 'a path of length $\varepsilon^2 k/5$' was proved in [11]. Still, when $G$ is an arbitrary graph, one cannot
easily deduce Corollary 1.3 from its 'path version' using a standard double exposure (sprinkling)
argument as, unlike the case when $G$ is a complete graph, there is no guarantee that $G$ contains
any edges that close a given long path into a cycle.

Another interesting corollary of Theorem 1.1 is that every $\mathcal{H}$-free graph with minimum degree $k$
contains a cycle of length at least linear in $n_{\mathcal{H}}(k)$. It seems, perhaps somewhat surprisingly, that
such a statement has not been proved before. We were unable to find a short proof of it that would
not employ the famous extension-rotation technique of Pósa [13].

The case which seems especially interesting is that when $\mathcal{H}$ is the family of all cycles of lengths
ranging from 3 to some $g$. Note that in this case, requiring a graph to be $\mathcal{H}$-free is the same as
requiring that its girth exceeds $g$. Since if $g$ is even, then $\mathrm{ex}(n, C_g) = O(n^{1+2/g})$, as proved by
Bondy and Simonovits [3], Theorem 1.1 has the following nice corollary.

Corollary 1.4. For every $g \in \{2, 3, \ldots\}$ and every positive $\varepsilon$, there exists a positive constant $\delta$
such that the following is true. Let $k$ be a sufficiently large integer and let $G$ be an arbitrary graph
with minimum degree at least $k$ and girth larger than $g$. If $p \ge \frac{1+\varepsilon}{k}$, then

$$\Pr\left(G_p \text{ contains a cycle of length at least } (\delta k)^{\lfloor g/2 \rfloor}\right) \ge 1 - \exp(-\delta k).$$

Observe that Corollary 1.4 remains true when one replaces the assumption that the girth of $G$
is larger than $g$ with the weaker assumption that $G$ does not contain a cycle of length $2\lfloor g/2 \rfloor$.

It is perhaps a good point to discuss yet another interpretation of our results, that related to
robustness of graph properties. The general approach of robustness, explicitly promoted in [10],
suggests investigating whether graph theoretic properties and statements typically remain valid
under taking random subgraphs - which would then indicate they are robust under (massive)
random deletions. For example, it is elementary to prove that any graph $G$ of minimum degree at
least $k$, $k \ge 2$, contains a cycle of length at least $k + 1$. Corollary 1.3 shows that this property
typically stays with the random subgraph $G_p$ of $G$, even when the edge probability $p$ is only a
notch above the critical probability $p^* = \frac{1}{k}$. Moreover, due to the remark above, if in addition $G$ is
assumed to be $\mathcal{H}$-free, then not only does $G$ contain deterministically a cycle of length $\Omega(n_{\mathcal{H}}(k))$, but
the $p$-random subgraph $G_p$ of $G$ retains this property with probability exponentially (in $k$) close
to 1, even for $p = \frac{1+\varepsilon}{k}$. Thus the property of containing long cycles is robust under taking random
subgraphs. This complements in a substantial way the qualitative result of [10].

Even though the existence of a cycle of length $\ell$ in a graph immediately implies the existence of
a path of length $\ell - 1$, we give a separate, much shorter, argument to prove that if $G$ is an $\mathcal{H}$-free
graph with minimum degree at least $k$ and $p \ge \frac{1+\varepsilon}{k}$, then with probability close to 1, the random
graph $G_p$ contains a path of length $\delta \cdot n_{\mathcal{H}}(k)$, see Theorem 1.5 below. Our proof of Theorem 1.5 is
a fairly straightforward adaptation of the argument given in [11].



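Theorem 1.5. Let $\varepsilon \in (0, 1)$, let $k$ be an integer, let $\mathcal{H}$ be a good family of graphs, and let $\ell$ be an
integer satisfying

$$\frac{\mathrm{ex}(9\ell/\varepsilon, \mathcal{H})}{\ell} \le \frac{k}{4}. \tag{3}$$

If $G$ is an $\mathcal{H}$-free graph with minimum degree at least $k$ and $p \ge \frac{1+\varepsilon}{k}$, then

$$\Pr(G_p \text{ contains a path of length } \ell) \ge 1 - 3\exp\left(-\frac{\varepsilon^3 k}{1500}\right).$$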

Note the more explicit, as compared to Theorem 1.1, dependence of $\ell$ on $\varepsilon$. In particular,
observe that when the family $\mathcal{H}$ is empty, then (3) is satisfied when $\ell = \varepsilon^2 k/162$, and hence
Theorem 1.5 implies that with probability very close to 1, $G_p$ contains a path of length $\varepsilon^2 k/162$.
This is somewhat weaker than [11, Theorem 4], which asserts that under the same assumptions,
i.e., $\delta(G) \ge k$ and $p \ge \frac{1+\varepsilon}{k}$, with probability tending to 1 as $k \to \infty$, the random graph $G_p$ contains
a path of length $\varepsilon^2 k/5$, but is optimal up to a constant factor as when $p = \frac{1+\varepsilon}{k}$, then a.a.s. the longest
path in $G(k+1, p)$ has length at most $2\varepsilon^2 k$, see, e.g., [9, Theorem 5.17].
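
The claim about the empty family can be checked numerically. The sketch below is ours, not the authors'; it assumes condition (3) reads $\mathrm{ex}(9\ell/\varepsilon, \mathcal{H})/\ell \le k/4$, uses $\mathrm{ex}(m, \emptyset) = \binom{m}{2}$, and, in the spirit of the paper's convention of omitting rounding, evaluates $\binom{m}{2}$ as $m(m-1)/2$ for real $m$. The function name is hypothetical.

```python
def condition_3_holds_for_empty_family(eps, k):
    """Check ex(9*ell/eps, {})/ell <= k/4 for ell = eps^2 * k / 162 (rounding ignored)."""
    ell = eps ** 2 * k / 162
    m = 9 * ell / eps                 # number of vertices in the Turan function
    turan = m * (m - 1) / 2           # ex(m, empty family) = C(m, 2), extended to real m
    return turan / ell <= k / 4


if __name__ == "__main__":
    for eps in (0.1, 0.5, 0.9):
        print(eps, condition_3_holds_for_empty_family(eps, 10 ** 6))
```

Algebraically, $m^2/(2\ell) = 81\ell/(2\varepsilon^2) = k/4$ for this choice of $\ell$, so the left-hand side falls short of $k/4$ by exactly $m/(2\ell) = 9/(2\varepsilon)$.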

At the heart of our proofs of Theorems 1.1 and 1.5 lies the analysis of the execution of the depth-first
search algorithm on the random graph $G_p$. This approach to investigating the properties of
random graphs near the threshold for the appearance of the giant component was considered in [11],
and our work draws heavily from there.

The remainder of the paper is organized as follows. In Section 2, we introduce some notational
conventions and list several auxiliary lemmas that we will refer to in the proofs of our main results.
In Section 3, we describe the depth-first search algorithm and list some of its properties for later
reference. Sections 4 and 5 contain proofs of Theorems 1.5 and 1.1, respectively. We close with
some concluding remarks and open problems in Section 6.



2. Preliminaries 

2.1. Notation. We use standard graph theoretic notation. In particular, given a graph $G$, we
denote its vertex set by $V(G)$ and the number of edges by $e(G)$. Given a set $A \subseteq V(G)$, we denote
the subgraph of $G$ induced by the set $A$ by $G[A]$. For $v \in V(G)$ and $A \subseteq V(G)$, we denote the
number of neighbors of $v$ (the degree of $v$ in $G$) and the number of neighbors of $v$ in the set $A$
by $\deg_G(v)$ and $\deg_G(v, A)$, respectively. For two disjoint sets $A, B \subseteq V(G)$, we write $e_G(A, B)$ to
denote the number of edges of $G$ with one endpoint in $A$ and one endpoint in $B$.

We shall now introduce the notion of excess edges, which will play a crucial role in the proof
of Theorem 1.1. Suppose that $G$ is an $n$-vertex graph with $r$ connected components of sizes
$n_1, \ldots, n_r$, respectively. Since each connected component contains a spanning tree, clearly
$e(G) \ge n_1 - 1 + \ldots + n_r - 1 = n - r$. The number of excess edges of $G$, which we denote by
$\mathrm{excess}(G)$, is the difference between $e(G)$ and this trivial lower bound. In other words, we let

$$\mathrm{excess}(G) = e(G) - |V(G)| + \#\text{connected components of } G. \tag{4}$$

It is easy to verify that if $G$ is a graph and $G'$ is a (not necessarily induced) subgraph of $G$, then

$$\mathrm{excess}(G') \le \mathrm{excess}(G). \tag{5}$$

We will use this simple observation in the proof of our main result.
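
As an illustration of definition (4), the number of excess edges can be computed with a union-find pass over the edge list. This is a small sketch of ours, assuming a simple graph given as a vertex collection and a list of distinct edges; the function name `excess` mirrors the notation above but the code is not taken from the paper.

```python
def excess(vertices, edges):
    """excess(G) = e(G) - |V(G)| + (number of connected components of G)."""
    parent = {v: v for v in vertices}

    def find(x):
        # Find the representative of x, halving the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    components = len(parent)
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:               # this edge merges two components
            parent[ru] = rv
            components -= 1
    return len(edges) - len(parent) + components


if __name__ == "__main__":
    triangle = [(0, 1), (1, 2), (0, 2)]
    print(excess(range(3), triangle))       # a triangle has one excess edge
    print(excess(range(3), triangle[:2]))   # a path (a tree) has none
```

Forests have no excess edges, and deleting edges never increases the count, in line with (5).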

Finally, let us remark that we will repeatedly omit rounding symbols whenever they are not 
crucial, and treat large numbers as integers. 

2.2. Tools. In our proofs, we will use the following standard estimate on tail probabilities of the
binomial distribution, see, e.g., [2, Appendix A].

Lemma 2.1. Let $n$ be a positive integer, let $p \in [0, 1]$, and let $X \sim \mathrm{Bin}(n, p)$.
(i) (Chernoff's inequality) For every positive $a$ with $a \le np/2$,

$$P(|X - np| > a) < 2\exp\left(-\frac{a^2}{4np}\right).$$

(ii) For every positive $\kappa$,

$$P(X > \kappa np) \le \left(\frac{e}{\kappa}\right)^{\kappa np}.$$
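
For a concrete feel of Lemma 2.1, the following sketch (ours, under the assumption that the two estimates read $P(|X - np| > a) < 2\exp(-a^2/(4np))$ for $a \le np/2$ and $P(X > \kappa np) \le (e/\kappa)^{\kappa np}$) compares exact binomial tails with these bounds for one choice of parameters. All function names are illustrative.

```python
from math import comb, e, exp


def binom_pmf(n, p, j):
    """Exact probability that Bin(n, p) equals j."""
    return comb(n, j) * p ** j * (1 - p) ** (n - j)


def two_sided_tail(n, p, a):
    """Exact value of P(|X - np| > a) for X ~ Bin(n, p)."""
    mean = n * p
    return sum(binom_pmf(n, p, j) for j in range(n + 1) if abs(j - mean) > a)


def upper_tail(n, p, kappa):
    """Exact value of P(X > kappa * n * p) for X ~ Bin(n, p)."""
    return sum(binom_pmf(n, p, j) for j in range(n + 1) if j > kappa * n * p)


def chernoff_bound(n, p, a):
    """Right-hand side of part (i), meaningful for 0 < a <= n*p/2."""
    return 2 * exp(-a * a / (4 * n * p))


def large_deviation_bound(n, p, kappa):
    """Right-hand side of part (ii)."""
    return (e / kappa) ** (kappa * n * p)


if __name__ == "__main__":
    n, p = 200, 0.1
    print(two_sided_tail(n, p, 10), chernoff_bound(n, p, 10))
    print(upper_tail(n, p, 3), large_deviation_bound(n, p, 3))
```

This is only a spot check for a single parameter choice, not a proof of either inequality.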

In the proof of Theorem 1.1, we will use the following simple estimates on the rate of growth of
the Turán function, Lemmas 2.2 and 2.3 below.

Lemma 2.2. Let $\mathcal{H}$ be an arbitrary family of graphs and let $m$ and $n$ be integers with $n \ge m \ge 2$.
Then

$$\mathrm{ex}(n, \mathcal{H}) \le \left(\frac{n-1}{m-1}\right)^2 \cdot \mathrm{ex}(m, \mathcal{H}).$$

Proof. Let $G$ be an $\mathcal{H}$-free graph with $n$ vertices and $\mathrm{ex}(n, \mathcal{H})$ edges. The subgraph of $G$ induced
by any set of $m$ vertices has at most $\mathrm{ex}(m, \mathcal{H})$ edges and therefore,

$$\binom{n}{m} \cdot \mathrm{ex}(m, \mathcal{H}) \ge \binom{n-2}{m-2} \cdot \mathrm{ex}(n, \mathcal{H}),$$

which easily implies the claimed inequality. □

Lemma 2.3. Let $\mathcal{H}$ be a good family of graphs and let $m$ and $n$ be integers with $n \ge m \ge 2$. Then

$$\frac{\mathrm{ex}(m, \mathcal{H})}{m} \le 2 \cdot \frac{\mathrm{ex}(n, \mathcal{H})}{n}.$$

Proof. The claimed inequality is an immediate consequence of the simple observation that, since
each graph in the family $\mathcal{H}$ is connected, $\mathrm{ex}(n, \mathcal{H}) \ge \lfloor n/m \rfloor \cdot \mathrm{ex}(m, \mathcal{H})$. □

3. Depth-first search algorithm

At the heart of our approach lies the depth-first search algorithm (DFS algorithm for short),
which is a well-known graph exploration method. We briefly describe it below.

The DFS algorithm takes as input a finite graph $G$ on a vertex set $V$ and, after visiting all
vertices of $G$, outputs a rooted spanning forest $F \subseteq G$ such that the connected components of
$F$ are the connected components of $G$. Since we will be interested not only in the connectivity
structure of $G$ but also in its cycles, we make our algorithm examine all the edges of $G$, not only
those that belong to $F$. Consequently, our version of the DFS algorithm runs in two phases. In the
first phase, the algorithm discovers the connected components of $G$ and constructs the spanning
forest $F$. In the second phase, it examines the remaining edges of $G$, both of whose endpoints lie in
the same tree of $F$ (connected component of $G$).

At all times, the algorithm maintains a partition of $V$ into three sets $S$, $T$, and $U$. At any given
moment, the set $S$ contains vertices whose exploration is complete (i.e., whose neighborhood in $F$
has been fully determined), $T$ is the set of vertices that have not yet been visited, and $U$ consists
of vertices that are being explored. The vertices in $U$ are kept in a stack, that is, a last-in-first-out
data structure. The algorithm starts with $T = V$ and $S = U = \emptyset$ and, as it examines the edges of
$G$, the vertices of $G$ are moved from $T$ to $U$ and from $U$ to $S$. The algorithm switches from the
first to the second phase when $S = V$ and $U = T = \emptyset$. Finally, the procedure terminates when all
the edges of $G$ have been examined.

We assume that the set $V$ is equipped with some canonical linear order $\preceq$. The first phase of
the execution of the DFS algorithm can be divided into $2|V|$ rounds. At the beginning of each
round, the algorithm checks whether or not $U$ is empty. If $U = \emptyset$, then the $\preceq$-smallest vertex of
$T$ is moved to $U$. Otherwise, the algorithm considers the top (most recently added) element $v$ of
$U$ and queries whether $\{v, w\}$ is an edge of $G$ for some $w \in T$, examining the vertices of $T$ from
the $\preceq$-smallest to the $\preceq$-largest. If $v$ does have a neighbor $w$ in $T$, then the $\preceq$-smallest such $w$ is
moved from $T$ to $U$, becoming its top element; no more queries about whether or not $\{v, w'\}$ is an
edge of $G$ for some $w' \in T$ with $w \prec w'$ are asked in this round. Otherwise, if the top element $v$ of
$U$ has no neighbors in $T$, then $v$ is moved to $S$. Finally, unless $S = V$, the algorithm proceeds to
the next round. Observe that in each round exactly one vertex moves, either from $T$ to $U$ or from
$U$ to $S$. Since at the end of the first phase all vertices have made their way from $T$ to $S$, passing
through $U$, the number of rounds is indeed $2|V|$.

At the end of the first phase, the DFS algorithm has constructed a rooted spanning forest $F$ of
the input graph $G$. The root of each tree in $F$ is the first vertex of it that was moved from $T$ to $U$.
The edges of $F$ are precisely those pairs $\{v, w\} \in G$ such that at some point during the execution
of the algorithm, $v$ was the top element of $U$ and $w$ was the $\preceq$-smallest neighbor of $v$ in $T$. It is
possible that at the end of the first phase, some pairs of vertices have not yet been queried. Note
that if $\{v, w\}$ is such a pair, then necessarily $v$ and $w$ belong to the same tree component of $F$
found by the algorithm and, moreover, $v$ is a predecessor of $w$ (or vice-versa) in this rooted tree, as
otherwise the algorithm would have queried $\{v, w\}$. In particular, each such $v$ and $w$ are connected
by a unique path in this tree. We denote the length of this path by $\lambda(v, w)$. In order to complete
the exploration of all edges of $G$, in the second phase of its execution, the algorithm queries all the
previously not queried pairs $\{v, w\}$, ordered according to the value of $\lambda(v, w)$, from the smallest to
the largest. We break ties arbitrarily, that is, the pairs with the same value of $\lambda(\cdot, \cdot)$ are queried in
an arbitrary order.

Finally, we list several properties of the DFS algorithm for future reference: 

(1) The algorithm starts exploring a connected component $C$ of $G$ at the moment the $\preceq$-smallest
vertex of $C$ is moved into the (empty beforehand) set $U$ and completes discovering $C$ when
$U$ becomes empty again. At the moment when $C$ is fully discovered, all of its vertices are
in $S$.

(2) If $T \ne \emptyset$, then every positively answered query increases the size of $U$ by one.

(3) At any stage, the algorithm has queried all pairs $\{v, w\}$ with $v \in S$ and $w \in U \cup T$.
Moreover, if $w \in T$, then $\{v, w\} \notin G$.

(4) The set $U$ always spans a path in $G$.

In several of our proofs, we will analyze the execution of the DFS algorithm on some (random)
subgraph of a given graph $G$. We will often use the fact that our graph exploration algorithm
implicitly defines a bijection $\varphi$ between the set of all $\{0, 1\}$-sequences of length $e(G)$ and the family
of all subgraphs of $G$, which pairs subgraphs with $m$ edges with sequences containing exactly $m$
ones. This bijection is defined as follows: Given a graph $G' \subseteq G$, we run the DFS algorithm with
input $G'$. We start with $\varphi(G')$ being the empty sequence and each time the algorithm queries
whether some $\{v, w\} \in G$ is an edge of $G'$, we append to $\varphi(G')$ the answer to this query, i.e., 1 if
$\{v, w\} \in G'$ and 0 otherwise. We will sometimes say that $G'$ is represented by the sequence $\varphi(G')$.

Conversely, any $\{0, 1\}$-sequence $(X_i)$ of length $e(G)$ represents a subgraph $G'$ of $G$ that is
described as follows. We run the DFS algorithm and each time it queries whether some pair $\{v, w\} \in G$
is an edge of $G'$, we let $\{v, w\} \in G'$ if $X_i = 1$ and $\{v, w\} \notin G'$ otherwise, where $i$ is the number of the
query (since the algorithm examines every edge of $G$ exactly once, it asks precisely $e(G)$ queries).
In particular, if $(X_i)$ is a sequence of i.i.d. Bernoulli random variables with success probability $p$,
then $G'$ is the random subgraph $G_p$ of $G$.
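
The two-phase exploration and the correspondence between $\{0, 1\}$-sequences and subgraphs can be sketched in Python as follows. This is our own illustrative rendering, not the authors' code: vertices are assumed to be integers ordered in the usual way, only pairs that are edges of the base graph are queried, and the names `explore`, `decode`, and `encode` are hypothetical.

```python
def explore(vertices, base_edges, answer):
    """Two-phase DFS over the edges of the base graph.

    answer(i, pair) gives the reply to the i-th query (0-based).
    Returns (bits, kept_edges, largest size reached by the stack U).
    """
    order = sorted(vertices)                       # the canonical linear order
    base = {frozenset(e) for e in base_edges}
    queried, bits, kept, parent = set(), [], [], {}

    def ask(v, w):
        pair = frozenset((v, w))
        queried.add(pair)
        result = bool(answer(len(bits), pair))
        bits.append(int(result))
        if result:
            kept.append(tuple(sorted(pair)))
        return result

    T, U, S = list(order), [], []                  # unvisited / stack / finished
    max_stack = 0
    while len(S) < len(order):                     # first phase, one move per round
        if not U:
            root = T.pop(0)
            parent[root] = None
            U.append(root)
        else:
            v, child = U[-1], None
            for w in T:                            # smallest to largest
                pair = frozenset((v, w))
                if pair in base and pair not in queried and ask(v, w):
                    child = w
                    break
            if child is None:
                S.append(U.pop())
            else:
                T.remove(child)
                parent[child] = v
                U.append(child)
        max_stack = max(max_stack, len(U))

    def tree_distance(v, w):
        steps, x, d = {}, v, 0
        while x is not None:                       # record all ancestors of v
            steps[x] = d
            x, d = parent[x], d + 1
        x, d = w, 0
        while x not in steps:                      # climb from w to a common ancestor
            x, d = parent[x], d + 1
        return steps[x] + d

    rest = sorted((tuple(sorted(q)) for q in base - queried),
                  key=lambda edge: (tree_distance(*edge), edge))
    for v, w in rest:                              # second phase, by tree distance
        ask(v, w)
    return bits, kept, max_stack


def decode(vertices, base_edges, bits):
    """Subgraph represented by a 0/1 sequence of length e(G)."""
    return explore(vertices, base_edges, lambda i, pair: bits[i])[1]


def encode(vertices, base_edges, sub_edges):
    """0/1 sequence representing a given subgraph of the base graph."""
    sub = {frozenset(e) for e in sub_edges}
    return explore(vertices, base_edges, lambda i, pair: pair in sub)[0]
```

Feeding `decode` a sequence of independent Bernoulli($p$) bits produces a sample of $G_p$, and the third value returned by `explore` is the length (in vertices) of the longest path that the stack $U$ held during the first phase.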

4. Proof of Theorem 1.5

Let $G$ be a finite $\mathcal{H}$-free graph with minimum degree at least $k$ and assume that $p \ge \frac{1+\varepsilon}{k}$. Without
loss of generality, we may assume that $\ell$ is the largest integer satisfying (3). (This quantity is well
defined as implied by Remark 1.2.) In particular,

$$\frac{81(\ell+1)}{2\varepsilon^2} \ge \frac{\mathrm{ex}(9(\ell+1)/\varepsilon, \mathcal{H})}{\ell+1} > \frac{k}{4},$$

and hence 

Note also that since the number $n$ of vertices of $G$ satisfies $nk \le 2 \cdot \mathrm{ex}(n, \mathcal{H})$, see (1), then $n \ge 9\ell/\varepsilon$,
as otherwise Lemma 2.3 and (3) would imply that

$$\frac{\mathrm{ex}(n, \mathcal{H})}{n} \le 2 \cdot \frac{\mathrm{ex}(9\ell/\varepsilon, \mathcal{H})}{9\ell/\varepsilon} \le \frac{\varepsilon k}{18} < \frac{k}{2},$$

a contradiction. Let

$$Q = \frac{4\ell k}{\varepsilon}$$

and note that $Q < \frac{nk}{2} \le e(G)$. Consider the $p$-random $\{0, 1\}$-sequence $X$ of length $e(G)$. We
will show that if we run the DFS algorithm on the random subgraph of $G$ represented by $X$, then
with probability at least $1 - 3\exp(-\varepsilon^3 k/1500)$, after asking exactly $Q$ queries, the set $U$ from the
description of the DFS algorithm contains at least $\ell + 1$ elements. Since the content of $U$ forms a
path in $G_p$, the assertion of the theorem will follow.

Let us first give an easy upper bound on the size of $S$. We claim that

$$|S| < \frac{6\ell}{\varepsilon}. \tag{7}$$

If this is not true, then there was some point in the execution of the DFS algorithm, before more
than $Q$ queries have been asked, when the set $S$ had exactly $6\ell/\varepsilon$ elements. Since all edges between
$S$ and $S^c$ had been queried, we have

$$Q \ge e(S, S^c) = \sum_{v \in S} \deg(v) - 2e(S) \ge \frac{6\ell}{\varepsilon} \cdot k - 2 \cdot \mathrm{ex}(6\ell/\varepsilon, \mathcal{H}) \ge \frac{6\ell k}{\varepsilon} - \frac{k\ell}{2} > Q,$$

where the last two inequalities follow from (3) and the assumption that $\varepsilon < 1$. This is clearly a
contradiction, which implies that (7) must hold.

Let $P$ and $N$ be the numbers of positive and negative answers to the first $Q$ queries, respectively.
That is, let

$$P = |\{i \in [Q] : X_i = 1\}| \quad \text{and} \quad N = Q - P.$$

Since $P \sim \mathrm{Bin}(Q, p)$ and $p \ge \frac{1+\varepsilon}{k}$, it follows from Chernoff's inequality (Lemma 2.1) that with
probability at least $1 - 2\exp(-\varepsilon\ell/8)$, which, by (6), is at least $1 - 3\exp(-\varepsilon^3 k/1500)$,

->On)f»(-l)f- <«) 

In the remainder of the proof, we show that (8) implies that $|U| > \ell$.

Let $s$ and $u$ be the sizes of the sets $S$ and $U$, respectively, after exactly $Q$ queries have been
asked. Recalling the properties of the DFS algorithm and (8), we have

s + u^ mm{P, n} > mm \ + 2) ~ \ ^ \ ^ 2) ~' 

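Since $u \ge \varepsilon s/2$ and (9) imply that $u \ge 2\ell$, we may assume from now on that $u < \varepsilon s/2$.
Note that by (7) and our assumption that $u < \varepsilon s/2 < s/2$,

$$\begin{aligned}
N \ge e(S, T) &= \sum_{v \in S} \deg(v) - 2e(S) - e(S, U) \ge sk - 2e(S \cup U) \ge sk - 2 \cdot \mathrm{ex}(s + u, \mathcal{H}) \\
&\ge sk - 2 \cdot \mathrm{ex}(3s/2, \mathcal{H}) \ge sk - 2 \cdot \mathrm{ex}(9\ell/\varepsilon, \mathcal{H}) \ge sk - \frac{k\ell}{2},
\end{aligned} \tag{10}$$

where the last inequality follows from (3). It follows from (10) that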

s + u'^ mm{P, n} > mm <^ 1 + - i", — ^ ^ U+oP - U + 



2J ke } \ 2) \ 2 
and consequently, using the assumption that u < es/2 and ([9]), 

e\ / e\ es / e\2 £ e(s + u) / e\ „ / e 
^+2)^n^ + 2)y-(^ + 2) 2^^1-^-(l+2)^>(^+2 
This completes the proof. □ 
5. Proof of Theorem 1.1

5.1. Proof outline. Let us start by giving a brief outline of the proof. Suppose that $G$ is an
$\mathcal{H}$-free graph with $n$ vertices and minimum degree at least $k$, for some sufficiently large integer $k$, and
that $p \ge \frac{1+\varepsilon}{k}$. The key step in the proof is to show that with high probability, the random graph
$G_p$ contains $\Omega(n)$ excess edges. To this end, we first prove that with probability $\Omega(1)$, a positive
proportion of the vertices of $G_p$ lies in large connected components (Theorem 5.2). Second, we
observe that it is extremely unlikely that there is a set $A$ of $\Omega(n)$ vertices that form only large
connected components in $G_p$ but the subgraph of $G$ induced by $A$ has merely $o(|A| \cdot k)$ edges.
It then follows from a standard double exposure argument that $e(G_p[A]) \ge (1 + \Omega(1))|A|$ and,
consequently, with positive probability, $G_p$ has $\Omega(n)$ excess edges (Theorem 5.3). Finally, we note
that the number of excess edges in a random graph is tightly concentrated around its expectation
(Proposition 5.1) and therefore, $\mathrm{excess}(G_p) = \Omega(n)$ with very high probability (Corollary 5.4). This
means that when we run the DFS algorithm on a typical $G_p$, then the number of queries asked
in the second phase of its execution is $\Omega(nk)$. Since the graph $G$ is $\mathcal{H}$-free, which implies upper
bounds on the densities of induced subgraphs of $G$, at least $\Omega(k^2)$ of these queries are about edges
$\{u, v\}$ such that $\lambda(u, v) \ge \ell$, where $\ell$ satisfies (2). With high probability, one of these pairs is an
edge of $G_p$; this edge closes a cycle of length at least $\ell$ in $G_p$.

5.2. Bounding the number of excess edges. As mentioned above, the key ingredient in our
proof of Theorem 1.1 is the fact that if $G$ is a graph with minimum degree at least $k$ and $p \ge \frac{1+\varepsilon}{k}$,
then with high probability, the random graph $G_p$ contains $\Omega(|V(G)|)$ excess edges, see Corollary 5.4
below. This statement will be an immediate consequence of the following two facts. First, using
a martingale concentration result, we prove that for an arbitrary graph $G$ and probability $p$, the
number of excess edges in the $p$-random subgraph of $G$ is concentrated around its expectation, see
Proposition 5.1. Second, we show that if $\delta(G) \ge k$ and $p \ge \frac{1+\varepsilon}{k}$, then this expectation is at least
$c \cdot |V(G)|$ for some positive constant $c$, see Theorem 5.3.

Proposition 5.1. Let $G$ be an arbitrary graph and let $p \in [0, 1/2]$. Let $\mu$ be the expected number
of excess edges in the random graph $G_p$. Then for every $\beta \in (0, 1]$,

$$\Pr(|\mathrm{excess}(G_p) - \mu| \ge \beta p e(G)) \le 2\exp\left(-\frac{\beta^2 p e(G)}{3}\right).$$

Proof. Let $m = e(G)$ and fix an arbitrary ordering $e_1, \ldots, e_m$ of the edges of $G$. For each $i \in [m]$,
let $X_i$ be the indicator random variable of the event $e_i \in G_p$. Fix an $i \in [m]$, let $A$ be an arbitrary
subset of $\{e_1, \ldots, e_{i-1}\}$, and let $\mathcal{A}_{i,A}$ denote the event that $G_p \cap \{e_1, \ldots, e_{i-1}\} = A$. Following
McDiarmid [12], we define $g_{i,A} \colon \{0, 1\} \to \mathbb{R}$ by

$$g_{i,A}(x) = \mathbb{E}[\mathrm{excess}(G_p) \mid \mathcal{A}_{i,A} \wedge X_i = x] - \mathbb{E}[\mathrm{excess}(G_p) \mid \mathcal{A}_{i,A}].$$

The function $g_{i,A}$ measures how much the expected number of excess edges in $G_p$ changes when it is
revealed whether $e_i$ is or is not an edge of $G_p$. Observe crucially that the function $\mathrm{excess}(\cdot)$ is edge
Lipschitz, i.e., adding or deleting a single edge to/from a graph changes the number of excess edges
by at most one. It follows that $|g_{i,A}(1) - g_{i,A}(0)| \le 1$ for all $i$ and $A$. Applying [12, Theorem 3.9]
to the sequence $(X_i)$ with $f = \mathrm{excess}$ and $t = \beta p e(G)$, since $b = \mathrm{maxdev} \le 1$ and the sum of
variances is at most $p e(G)$, we obtain the claimed bound. □

Our proof of Theorem 5.3 will use the following fact, which is implicit in many earlier works on
the phase transition in $G(n, p)$, see, e.g., [3, 9]. Our proof here, which we include for the sake of
completeness, follows the approach of [11].

Theorem 5.2. Let $\varepsilon \in (0, 1/3)$, let $k$ be an arbitrary integer, let $G$ be a graph with minimum degree
at least $k$, and let $v$ be an arbitrary vertex of $G$. If $p \ge \frac{1+\varepsilon}{k}$, then

$$\Pr(\text{the connected component of } v \text{ in } G_p \text{ has at least } \varepsilon k/2 \text{ vertices}) > \varepsilon/6.$$

Proof. Let $G$ be a graph with minimum degree at least $k$, let $v$ be an arbitrary vertex of $G$, and fix
some arbitrary linear order $\preceq$ on $V(G)$ whose smallest element is $v$. Assume that $p \ge \frac{1+\varepsilon}{k}$ and consider
the $p$-random $\{0, 1\}$-sequence $X$ of length $e(G)$. We will show that if we run the DFS algorithm
on the random subgraph of $G$ represented by $X$, then with probability at least $\varepsilon/6$, the following
event holds:

$\mathcal{A}$: From the moment $v$ is moved to $U$, the set $U$ does not become empty until $|S| \ge \varepsilon k/2$.

This clearly implies the assertion of the theorem.

With the aim of estimating the probability of $\mathcal{A}$, let us define for each $i \in [e(G)]$,

$$Y_i = \begin{cases} (1 - \varepsilon/2) \cdot k & \text{if } X_i = 1, \\ -1 & \text{if } X_i = 0, \end{cases} \quad \text{and} \quad Z_i = (1 - \varepsilon/2) \cdot k + \sum_{j=1}^{i} Y_j.$$

We claim that if $Z_i > 0$ for all $i \le e(G)$, then $\mathcal{A}$ holds. Indeed, suppose that the sequence $X$ is such
that $Z_i > 0$ for all $i$ and consider the execution of the DFS algorithm on the random subgraph of $G$
defined by $X$. Suppose that $\mathcal{A}$ does not hold, that is, the set $U$ becomes empty when $|S| < \varepsilon k/2$.
Recall that initially $U$ contains one element (the vertex $v$) and the size of $U$ increases by one every
time $X_i = 1$ for some $i$. Moreover, by the time a vertex $w$ is moved from $U$ to $S$, the DFS algorithm
has queried all the $\deg_G(w, T)$ edges of $G$ that connect $w$ and $T$ and got a negative answer for each
of those queries. Finally, note that clearly $\deg_G(w, T) \ge k - |S| - |U| \ge (1 - \varepsilon/2) \cdot k$. In particular,
the algorithm gets at least $(1 - \varepsilon/2)k$ negative answers to queries about pairs that contain $w$ before
moving $w$ from $U$ to $S$. These three facts readily imply that if $\mathcal{A}$ does not hold, then $Z_i \le 0$ for
some $i$.

Finally, we estimate the probability that $Z_i > 0$ for all $i$. To this end, let $\alpha$ be the smallest
positive real that satisfies

$$f(\alpha) := p \cdot \alpha^{(1 - \varepsilon/2) \cdot k} + (1 - p) \cdot \alpha^{-1} = 1 \tag{11}$$

and note that $\alpha \le e^{-\varepsilon/(4k)}$ as $\lim_{a \to 0^+} f(a) = \infty$ and, recalling that $\varepsilon < 1/3$ and that $e^x \le 1 + x + x^2$
if $|x| \le 1$,



4 4 y 4A: V 4fc y 

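Now, let $Z_0 = (1 - \varepsilon/2)k$ and for each $i$ with $0 \le i \le e(G)$, let $M_i = \alpha^{Z_i}$. Since $\alpha$ satisfies (11),
the sequence $(M_i)$ is easily seen to be a martingale. Furthermore, let

$$\tau = \min\{i : Z_i \le 0\} \quad \text{and} \quad M_i^* = M_{\min\{i, \tau\}}$$

and observe that the sequence $(M_i^*)$ is also a martingale, as $\tau$ is a stopping time with respect to
$(M_i)$. Therefore,

$$\mathbb{E}[M^*_{e(G)}] = \mathbb{E}[M_0^*] = \mathbb{E}[M_0] = \mathbb{E}[\alpha^{Z_0}] = \alpha^{(1 - \varepsilon/2)k} \le e^{\frac{\varepsilon}{4}\left(\frac{\varepsilon}{2} - 1\right)}.$$

On the other hand, since $\alpha < 1$ and therefore $M^*_{e(G)} \ge 1$ whenever $Z_i \le 0$ for some $i$, then

$$\mathbb{E}[M^*_{e(G)}] \ge \Pr(Z_i \le 0 \text{ for some } i).$$

It follows that, recalling again that $e^x \le 1 + x + x^2$ if $|x| \le 1$,

$$\Pr(Z_i > 0 \text{ for all } i) \ge 1 - e^{\frac{\varepsilon}{4}\left(\frac{\varepsilon}{2} - 1\right)} \ge \frac{\varepsilon}{4}\left(1 - \frac{\varepsilon}{2}\right) - \frac{\varepsilon^2}{16} > \frac{\varepsilon}{6}. \qquad □$$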

Theorem 5.3. For every positive $\varepsilon$, there exists a positive constant $c$ such that the following holds.
Let $k$ be a sufficiently large integer and let $G$ be an $n$-vertex graph with minimum degree at least $k$.
If $p \ge \frac{1+\varepsilon}{k}$, then the expected number of excess edges in $G_p$ is at least $cn$.

Proof. Let us first choose some parameters. Let

$$\gamma = \frac{\varepsilon}{24} \quad \text{and} \quad \delta = \frac{\gamma^2}{4e^4},$$

and assume that $k \ge 4/\gamma^2$. Without loss of generality, we may also assume that $\varepsilon \le 1/2$. Let $G$ be a
graph with minimum degree at least $k$ and suppose that $p \ge \frac{1+\varepsilon}{k}$. We will expose the edges of $G_p$
in two rounds. To this end, let $p_1 = \frac{1+\varepsilon/2}{k}$, let $p_2$ satisfy $(1 - p_1)(1 - p_2) = 1 - p$, and observe that
$p_2 \ge p - p_1 \ge \frac{\varepsilon}{2k}$. Consider the following two events in $G_{p_1}$:

$\mathcal{A}$: at least $\gamma n$ vertices of $G_{p_1}$ lie in components of size at least $\varepsilon k/4$.

$\mathcal{B}$: some $B \subseteq V(G)$ with $|B| \ge \gamma n$ and $e(G[B]) \le \delta|B|k$ satisfies $e(G_{p_1}[B]) \ge |B|/2$.

Suppose now that $\mathcal{A}$ holds. In this case, there is a set $A \subseteq V(G)$ of at least $\gamma n$ vertices such that
each connected component of $G_{p_1}[A]$ has at least $\varepsilon k/4$ vertices and therefore,






MICHAEL KRIVELEVICH AND WOJCIECH SAMOTIJ 



Assume furthermore that $\mathcal{B}$ does not hold. Then necessarily $e(G[A]) > \delta|A|k$. Observe that
each edge of $G \setminus G_{p_1}$ belongs to $G_{p_2}$ with probability $p_2$. Hence, considering separately the cases
e(GpJA]) ^ §\^h. and e(GpJy4]) < we obtain the following bound: 

E [e{Gp[A]) \A^^B]^¥. [e{Gp, [A]) + ■ {e{G[A\) - e{Gp, [A])) \AA^B] 



^ . (6\A\k / 4 \ , ^, 6\A\k 



(12) 



Consequently, conditioned on $\mathcal{A}$ and $\neg\mathcal{B}$ holding simultaneously, since $\mathrm{excess}(G_p) \ge \mathrm{excess}(G_p[A])$
by (5), the expected number of excess edges in $G_p$ satisfies

E[excess(Gp)] ^ E [excess (Gp [A])] ^ —n. 

8 

It is therefore enough to prove the following claim. 

Claim. $\Pr(\mathcal{A} \wedge \neg\mathcal{B}) \ge \gamma/2$.

Let $L$ be the number of vertices of $G_{p_1}$ that lie in components of size at least $\varepsilon k/4$. It follows from
Theorem 5.2 that $\mathbb{E}[L] \ge 2\gamma n$ and hence, $\Pr(\mathcal{A}) \ge \gamma$, as clearly $L \le n$. With the aim of estimating
the probability of $\mathcal{B}$, fix an arbitrary set $B$ of $b$ vertices, where $b \ge \gamma n$, such that $e(G[B]) \le \delta bk$.
It follows from Lemma 2.1 that

$$\Pr\big(e(G_{p_1}[B]) \ge b/2\big) \le \Pr\big(\mathrm{Bin}(\delta bk, 2/k) \ge b/2\big) \le (4e\delta)^{b/2}.$$

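Consequently,

$$\Pr(\mathcal{B}) \le \sum_{b \ge \gamma n} \binom{n}{b} (4e\delta)^{b/2} \le \sum_{b \ge \gamma n} \Big(\frac{en}{b}\Big)^{b} (4e\delta)^{b/2} \le \sum_{b \ge \gamma n} \Big(\frac{4e^3\delta}{\gamma^2}\Big)^{b/2} = \sum_{b \ge \gamma n} e^{-b/2} \le \gamma/2,$$

since $\gamma n \ge \gamma k \ge 4/\gamma$. This completes the proof of the claim and consequently, the proof of the
theorem. □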
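The final chain of inequalities can be sanity-checked numerically. The sketch below assumes the parameter choice $\delta = \gamma^2/(4e^4)$, which is the value that makes $(e/\gamma)^b (4e\delta)^{b/2}$ collapse to $e^{-b/2}$, and evaluates the exact sum $\sum_{b \ge \gamma n} \binom{n}{b}(4e\delta)^{b/2}$ in log-space. The function name `tail_sum` and the sample values of `gamma` and `n` are illustrative choices, not taken from the paper.

```python
import math

def tail_sum(n: int, gamma: float) -> float:
    """Sum over b >= gamma*n of C(n, b) * (4*e*delta)^(b/2), computed in log-space."""
    delta = gamma ** 2 / (4 * math.e ** 4)       # assumed parameter choice
    log_q = 0.5 * math.log(4 * math.e * delta)   # log of (4*e*delta)^(1/2)
    total = 0.0
    for b in range(math.ceil(gamma * n), n + 1):
        # log of the binomial coefficient C(n, b) via log-gamma
        log_binom = math.lgamma(n + 1) - math.lgamma(b + 1) - math.lgamma(n - b + 1)
        total += math.exp(log_binom + b * log_q)
    return total

gamma = 1 / 24                   # sample value only
n = math.ceil(4 / gamma ** 2)    # ensures gamma * n >= 4 / gamma
print(tail_sum(n, gamma), gamma / 2)
```

For these sample values the computed sum should be far below $\gamma/2$, since the union bound in the proof is very generous.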

Corollary 5.4. For every positive $\varepsilon$, there exists a positive constant $c$ such that the following holds.
Let $k$ be a sufficiently large integer and let $G$ be an $n$-vertex graph with minimum degree at least $k$.
If $p \ge \frac{1+\varepsilon}{k}$, then with probability at least $1 - 2\exp(-c^2 n/12)$, the number of excess edges in $G_p$ is
at least $cn$.

Proof. Let $c = \min\{1, c_{5.3}(\varepsilon)/2\}$, where $c_{5.3}(\varepsilon)$ is the constant from Theorem 5.3, and suppose that $k$ is
sufficiently large, so that the assertion of Theorem 5.3 is true. Without loss of generality, we may assume that
$pe(G) \le 4n$ as otherwise by Chernoff's inequality (Lemma 2.1), $e(G_p) \ge 2n$ with probability at least
$1 - \exp(-n/4)$, and consequently

$$\mathrm{excess}(G_p) \ge e(G_p) - n \ge cn.$$

It follows from Theorem 5.3 that $\mathbb{E}[\mathrm{excess}(G_p)] \ge 2cn$ and therefore by Proposition 5.1 with $\beta = \frac{cn}{pe(G)}$,

$$\Pr(\mathrm{excess}(G_p) < cn) \le 2\exp\Big(-\frac{cn}{3pe(G)} \cdot cn\Big) \le 2\exp\Big(-\frac{c^2 n}{12}\Big),$$

as $pe(G) \le 4n$ by our assumption. □
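To get a feeling for the quantity bounded in Corollary 5.4, one can sample $G_p$ and count its excess edges directly. The sketch below assumes that the excess of a graph is its number of edges minus its number of vertices plus its number of connected components (so that a spanning forest has excess zero). The $k$-regular circulant base graph and all function names are illustrative choices, not part of the paper.

```python
import random

def excess(num_vertices, edges):
    """Edges minus vertices plus connected components (assumed definition of excess)."""
    parent = list(range(num_vertices))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    components = num_vertices
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:                       # edge joins two components
            parent[ru] = rv
            components -= 1
    return len(edges) - num_vertices + components

def random_subgraph(edges, p, rng):
    """Keep each edge independently with probability p."""
    return [e for e in edges if rng.random() < p]

def circulant_edges(n, k):
    """Edges of a k-regular circulant graph on n vertices (k even, k < n)."""
    return [(i, (i + j) % n) for i in range(n) for j in range(1, k // 2 + 1)]

rng = random.Random(0)
n, k, eps = 2000, 20, 0.5
edges = circulant_edges(n, k)
sample = random_subgraph(edges, (1 + eps) / k, rng)
print(excess(n, sample) / n)   # empirical excess per vertex for one sample
```

Repeating the last three lines with different seeds gives a rough picture of how tightly the excess concentrates around a linear function of $n$.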
5.3. Finding a long cycle. Let $G$ be an $\mathcal{H}$-free graph with minimum degree at least $k$ and assume
that $p \ge \frac{1+\varepsilon}{k}$. Let $c = c_{5.4}(\varepsilon)/3$, let $\delta = c^2/21$, and suppose that $k$ is sufficiently large, so that, in
particular, the assertion of Corollary 5.4 holds. Without loss of generality, we may assume that
$\varepsilon \le 1$ and that $\ell$ is the largest integer satisfying (2) with $\delta$ replaced by $c/2$ (as $c/2 > \delta$, we will be
proving a stronger statement). In particular,

$$\frac{9(\ell+1)}{2} \ge \frac{\mathrm{ex}(3(\ell+1), \mathcal{H})}{\ell+1} \ge \frac{ck}{2},$$
and hence $\ell \ge ck/10$. Consider the $p$-random $\{0,1\}$-sequence $X$ of length $e(G)$. Recall the definition
of $\lambda(\cdot,\cdot)$ from Section 3. We will show that if we run the DFS algorithm on the random subgraph of
$G$ represented by $X$, then with probability at least $1 - 4\exp(-c^2k/16)$, the following event holds:

$\mathcal{A}$: At the moment when the DFS algorithm finishes discovering all the connected components,
i.e., when $S = V$, the number of edges $\{v,w\} \in G$ such that $\{v,w\}$ has not yet been queried
and $\lambda(v,w) \ge \ell$ is at least $ck\ell/2$.

This will clearly be sufficient, since it implies that each of the last $ck\ell/2$ edges of $G$ that are queried
by our graph exploration algorithm closes a cycle of length at least $\ell$ and with probability at least
$1 - (1-p)^{ck\ell/2}$, one of those queries will be answered positively. Since, as noted above, $\ell \ge ck/10$,
this probability is at least $1 - \exp(-c^2k/20)$.
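The last estimate can be spelled out as follows. This is a short derivation that uses only $1 - p \le e^{-p}$, the bound $\ell \ge ck/10$, and $pk \ge 1$, the latter being assumed here as a consequence of the hypothesis on $p$.

```latex
\[
  1 - (1-p)^{ck\ell/2}
  \;\ge\; 1 - \exp\!\Big(-\frac{p\,ck\ell}{2}\Big)
  \;\ge\; 1 - \exp\!\Big(-\frac{c\ell}{2}\Big)
  \;\ge\; 1 - \exp\!\Big(-\frac{c^2 k}{20}\Big).
\]
```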

Let $n$ be the number of vertices of $G$. It now suffices to prove the following claim.

Claim. The event $\mathcal{A}$ contains the intersection of the following two events:

(i) The number of excess edges in $G_p$ is at least $3cn$.

(ii) There are fewer than $3cn$ indices $i$ with $i > e(G) - ckn$ such that $X_i = 1$.

To see that the above claim implies the claimed bound on the probability that $G_p$ contains a
cycle of length at least $\ell$, note that $n \ge k + 1$, that, by Corollary 5.4, (i) holds with probability at
least $1 - 2\exp(-c^2n/2)$, and that, by Chernoff's inequality (Lemma 2.1), (ii) holds with probability
at least $1 - 2\exp(-cn/16)$.

We now prove the claim. Let $G' \subseteq G$ consist of pairs (edges of $G$) that have not been queried at
the time when our graph exploration algorithm finishes discovering the connected components of
$G_p$. Note that (i) and (ii) imply that $e(G') \ge ckn$ as the rooted spanning forest $F$ of $G_p$ constructed
by the DFS algorithm contains no excess edges. Recall that each edge of $G'$ connects a vertex $w$
with its predecessor $v$ in one of the rooted trees forming $F$ and that $\lambda(v,w)$ denotes the length of
the unique path joining $v$ and $w$ in this tree. Since the average degree of $G'$ is at least $2ck$ and the
endpoints of each edge of $G'$ lie in the same connected component of $F$, there must be a rooted
tree $R$ in $F$ such that the average degree of $G'[V(R)]$ is at least $2ck$. We claim that there are at
least $ck\ell/2$ edges $\{v,w\} \in G'[V(R)]$ such that $\lambda(v,w) \ge \ell$.

Let, for every subtree $R^*$ of $R$,

$$d(R^*) = \sum_{w \in V(R^*)} \big|\{v \in V(R) : v \text{ is a predecessor of } w \text{ in } R \text{ and } \{v,w\} \in G'\}\big|$$

and note that $d(R) = e(G'[V(R)]) \ge ck|V(R)|$. We claim that $R$ contains a subtree $R^*$ with
$\ell \le |V(R^*)| < 2\ell$ such that $d(R^*) \ge ck\ell$. Indeed, we can construct such $R^*$ as follows. First,
repeatedly delete from $R$ every full subtree $R'$ such that $d(R') < ck|V(R')|$ until there are no such
$R'$ left; here, a full subtree is a subtree of a rooted tree that is induced by some vertex and all of its
descendants. Since as a result of every such deletion, the ratio $d(R)/|V(R)|$ increases, we may assume
that $R$ contains no such $R'$. Observe that $R$ still has at least $\ell$ vertices, as $G'[V(R)]$ is an $\mathcal{H}$-free
graph with average degree at least $2ck$, cf. (2) and Lemma 2.3. We can obtain an $R^*$ with the
claimed property by running the following simple recursive procedure on $R$: If $R$ has fewer than $2\ell$
vertices, then we let $R^* = R$. Otherwise, if the full subtree $R'$ rooted at one of the children of the
root of $R$ has at least $\ell$ vertices, then we let $R = R'$ and work with $R'$, noting that $d(R') \ge ck|V(R')|$
by our assumption on the original tree. Else, $R$ has at least $2\ell$ vertices but each full subtree of $R$
attached to its root has fewer than $\ell$ vertices. In this case, we can easily obtain a subtree $R^*$ of $R$
that has the required properties by deleting some of these full subtrees.
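The size part of this recursive procedure can be sketched in code. The sketch below only tracks vertex counts: it descends into a child whose full subtree has at least $\ell$ vertices while the current tree has at least $2\ell$ vertices, and otherwise deletes child subtrees until fewer than $2\ell$ vertices remain. It does not track $d(\cdot)$, and it deletes children in an arbitrary order, whereas the proof needs a choice that keeps $d(R^*) \ge ck\ell$. The tree representation (a dictionary `children`) and the function names are hypothetical.

```python
def subtree_sizes(children, root):
    """Number of vertices in the full subtree hanging below each vertex."""
    order, stack = [], [root]
    while stack:                      # iterative pre-order traversal
        v = stack.pop()
        order.append(v)
        stack.extend(children.get(v, []))
    size = {}
    for v in reversed(order):         # descendants are handled before ancestors
        size[v] = 1 + sum(size[c] for c in children.get(v, []))
    return size

def find_medium_subtree(children, root, ell):
    """Return (r, kept, total): a subtree rooted at r, consisting of r and the
    full subtrees of the children in `kept`, with ell <= total < 2 * ell vertices."""
    size = subtree_sizes(children, root)
    if ell < 1 or size[root] < ell:
        raise ValueError("the tree must have at least ell >= 1 vertices")
    r = root
    while size[r] >= 2 * ell:
        # descend into a child whose full subtree is still large enough
        big = next((c for c in children.get(r, []) if size[c] >= ell), None)
        if big is None:
            break
        r = big
    kept = list(children.get(r, []))
    total = size[r]
    # every child subtree now has fewer than ell vertices, so each deletion
    # removes fewer than ell vertices and the total cannot jump below ell
    while total >= 2 * ell:
        total -= size[kept.pop()]
    return r, kept, total

path = {i: [i + 1] for i in range(9)}     # path 0 - 1 - ... - 9 rooted at 0
star = {0: list(range(1, 10))}            # root 0 with nine leaves
print(find_medium_subtree(path, 0, 3))
print(find_medium_subtree(star, 0, 3))
```

In the path example the procedure keeps descending, while in the star example it stays at the root and discards leaves; both outcomes land in the size window $[\ell, 2\ell)$.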

Now, let $R^+$ be the subtree of $R$ obtained from $R^*$ by adding to it the first (at most) $\ell$ vertices
on the unique path connecting the root of $R^*$ to the root of $R$. By definition, if $\{v,w\}$ is an edge
of $G'$ such that $\lambda(v,w) < \ell$ and $v$ is a predecessor of $w$, then $w \in V(R^*)$ implies that $v \in V(R^+)$.
It follows that

$$\big|\{\{v,w\} \in G' : \lambda(v,w) \ge \ell\}\big| \ge d(R^*) - e(G'[V(R^+)]) \ge ck\ell - \mathrm{ex}(3\ell, \mathcal{H}) \ge ck\ell/2.$$

To see the last inequality, note that by (2) and Lemma 2.2,

$$\mathrm{ex}(3\ell, \mathcal{H}) \le 10 \cdot \mathrm{ex}(\ell, \mathcal{H}) \le 10\delta\ell k \le ck\ell/2.$$

This completes the proof of the claim and consequently, the proof of the theorem. □

6. Concluding remarks 

6.1. Dependence of $\ell$ on $\varepsilon$. In our proofs of Theorems 1.1 and 1.5, we made no attempts to
optimize the dependence of $\delta$ on $\varepsilon$. A careful analysis of the proof of Theorem 1.1 shows that $\delta$
has only polynomial dependence on $\varepsilon$ and hence when $\mathcal{H}$ is finite, then condition (2) is satisfied for
some $\ell = \Omega(\varepsilon^{\alpha_{\mathcal{H}}} \cdot n_{\mathcal{H}}(k))$, where $\alpha_{\mathcal{H}}$ is a constant depending only on $\mathcal{H}$. It would be interesting to
know whether one can replace $\alpha_{\mathcal{H}}$ with 2, at least when $\mathcal{H}$ is the empty family, as in the case
of $G(n,p)$, which, if $p \ge \frac{1+\varepsilon}{n}$, a.a.s. contains a cycle of length $\Omega(\varepsilon^2 n)$.

6.2. Regular graphs. The proof of Theorem 1.1 can be somewhat simplified (at least conceptually)
when one makes the assumption that the graph $G$ is regular. Recall that the key step in our
proof is establishing that with high probability, the graph $G_p$ contains $\Omega(n)$ excess edges. If $G$ is
regular, then using the approach of [7], one can obtain a lower bound on the expected number of small
tree components in Gp that is sufficiently strong to imply, via (jH), that E[excess(Gp)] ^ e^n/12. 
Proposition 15.11 will then imply the statement of Corollarv 15.41 with c replaced by e^/20, which is 
sufficient for the proof of Theorem 11.11 

Moreover, following the approach of [7], one may also show that for some absolute constant $C$,
with high probability only at most $C\varepsilon n$ vertices of $G_p$ lie in connected components that are not
small trees. Consequently, in the proof of Theorem 1.1, we may find a rooted tree $R$ such that the
average degree of G'{V{R)] is at least ■ k. This is enough to show that with high probability, 

2 

Gp contains a cycle of length at least ■ k. 
6.3. Graphs with large average degree. It is natural to ask whether Theorem 1.1 is still true
when one replaces the assumption that the minimum degree of $G$ is at least $k$ with the much weaker
assumption that the average degree of $G$ is at least $k$. Since each graph $G$ with average degree $k$
contains a subgraph with minimum degree exceeding $k/2$, one can easily deduce such a statement if
one strengthens the assumption on $p$ to $p \ge \frac{2+\varepsilon}{k}$. The following easy argument shows that this is
not optimal.

Assume that $G$ has $n$ vertices and average degree $k$. For every $p \in [0,1]$, since the function
$(0,\infty) \ni x \mapsto (1-p)^x$ is convex, the expected number of isolated vertices in $G_p$ is at least $(1-p)^k \cdot n$.
Let $c_0$ be the positive solution of the equation $\frac{c}{2} - 1 + e^{-c} = 0$ and observe that $c_0 \approx 1.6$. Using (5),
we see that if $p \ge c/k$ for some $c > c_0$ and $k$ is sufficiently large, then $\mathbb{E}[\mathrm{excess}(G_p)] = \Omega(n)$. It follows that the
assumption on $p$ can be weakened to $p \ge \frac{c_0+\varepsilon}{k}$. We believe that similarly as in the minimum degree
case, the assumption $p \ge \frac{1+\varepsilon}{k}$ is sufficient, but at the moment we are unable to establish this claim.
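The constant $c_0$ can be approximated numerically. The sketch below assumes that the defining equation reads $\frac{c}{2} - 1 + e^{-c} = 0$, a reading that is consistent with the stated value $c_0 \approx 1.6$, and locates its positive root by bisection. The bracket $[1, 2]$ and the function names are illustrative choices.

```python
import math

def f(c):
    """Left-hand side of the assumed equation c/2 - 1 + exp(-c) = 0."""
    return c / 2 - 1 + math.exp(-c)

def positive_root(lo=1.0, hi=2.0, tol=1e-12):
    """Bisection on [lo, hi]; f is negative at lo and positive at hi."""
    assert f(lo) < 0 < f(hi)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

c0 = positive_root()
print(c0)
```

Note that $c = 0$ is also a root of this equation, which is why the bracket is chosen away from the origin.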